In this competition, we are trying to identify common diseases of cassava crops using data science and machine learning. Existing methods of disease detection require farmers to solicit the help of government-funded agricultural experts to visually inspect and diagnose the plants. This approach is labor-intensive, costly, and limited by the short supply of experts. An automated pipeline based on mobile-quality photos of the cassava leaves would be preferable.

This competition provides a farmer-crowdsourced dataset, labeled by experts at the National Crops Resources Research Institute (NaCRRI).

In this kernel, I will present a quick EDA.

import numpy as np
import pandas as pd
import seaborn as sns
import albumentations as A
import matplotlib.pyplot as plt
import os, gc, cv2, random, warnings, math, sys, json, pprint, pdb

import tensorflow as tf
from tensorflow.keras import backend as K
import tensorflow_hub as hub

from sklearn.model_selection import train_test_split

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 
warnings.simplefilter('ignore')

Tip: Setting a seed helps reproduce results. Setting the DEBUG parameter runs the pipeline on a 10% stratified subset of the data, which is enough to validate the architecture quickly.
SEED = 16
DEBUG = False #@param {type:"boolean"}

os.environ['PYTHONHASHSEED'] = str(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
Mounted at /content/gdrive
dataset_path = '/content/gdrive/MyDrive/1_AUSTIN CHEN/Data Scientist/Datasets/cassava-leaf-disease-classification'
os.chdir(dataset_path)
os.listdir(dataset_path)
['efficientnetb3_notop.h5',
 'label_num_to_disease_map.json',
 'sample_submission.csv',
 'train.csv',
 'cassava-leaf-disease-classification.zip',
 'test_images',
 'test_tfrecords',
 'train_images',
 'train_tfrecords',
 '.ipynb_checkpoints']
df = pd.read_csv(dataset_path + '/train.csv')
df.head()
image_id label
0 1000015157.jpg 0
1 1000201771.jpg 3
2 100042118.jpg 1
3 1000723321.jpg 1
4 1000812911.jpg 3

Check how many images are available in the training dataset, and whether each image ID in the training set is unique.

print(f"There are {len(df)} train images")
len(df.image_id) == len(df.image_id.unique())
There are 21397 train images
True
(df.label.value_counts(normalize=True) * 100).plot.barh(figsize = (8, 5))
<matplotlib.axes._subplots.AxesSubplot at 0x7fd1f40ee7f0>
df['filename'] = df['image_id'].map(lambda x : dataset_path + '/train_images/' + x)
df = df.drop(columns = ['image_id'])
df = df.sample(frac=1).reset_index(drop=True)
df.head()
label filename
0 3 /content/gdrive/MyDrive/1_AUSTIN CHEN/Data Sci...
1 3 /content/gdrive/MyDrive/1_AUSTIN CHEN/Data Sci...
2 3 /content/gdrive/MyDrive/1_AUSTIN CHEN/Data Sci...
3 3 /content/gdrive/MyDrive/1_AUSTIN CHEN/Data Sci...
4 3 /content/gdrive/MyDrive/1_AUSTIN CHEN/Data Sci...
if DEBUG:
    _, df = train_test_split(
        df,
        test_size = 0.1,
        random_state=SEED,
        shuffle=True,
        stratify=df['label'])
with open(dataset_path + '/label_num_to_disease_map.json') as file:
  id2label = json.loads(file.read())
id2label
{'0': 'Cassava Bacterial Blight (CBB)',
 '1': 'Cassava Brown Streak Disease (CBSD)',
 '2': 'Cassava Green Mottle (CGM)',
 '3': 'Cassava Mosaic Disease (CMD)',
 '4': 'Healthy'}

In this case, we have 5 labels (4 diseases and healthy):

  1. Cassava Bacterial Blight (CBB)
  2. Cassava Brown Streak Disease (CBSD)
  3. Cassava Green Mottle (CGM)
  4. Cassava Mosaic Disease (CMD)
  5. Healthy

Label 3, Cassava Mosaic Disease (CMD), is the most common label. This imbalance may have to be addressed with a weighted loss function or oversampling. I might try this in a future iteration of this kernel or in a new kernel.
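
One way to act on the imbalance is to weight each class inversely to its frequency. The sketch below is only an illustration and is not run anywhere in this kernel: `balanced_class_weights` is a hypothetical helper, the toy label list is made up, and the formula is the common "balanced" heuristic n_samples / (n_classes * class_count).

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class_id: weight}, where weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in sorted(counts.items())}

# Toy labels (made up for illustration): class 3 dominates, like CMD in train.csv.
toy_labels = [3, 3, 3, 3, 3, 3, 0, 1, 2, 4]
weights = balanced_class_weights(toy_labels)
print(weights)  # the frequent class gets a weight below 1, the rare ones above 1
```

A dictionary of this shape could, in principle, be passed as the `class_weight` argument of `model.fit`; whether it actually helps on this dataset would have to be checked experimentally.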

Let's check an example image to see what it looks like

from PIL import Image
img = Image.open(df[df.label==3]['filename'].iloc[0])
width, height = img.size
print(f"Width: {width}, Height: {height}")
Width: 800, Height: 600
img

Config parameters

The input resolution differs for each base model from B0 to B7. Here is the input shape expected by each model:

Base model       Resolution
EfficientNetB0   224
EfficientNetB1   240
EfficientNetB2   260
EfficientNetB3   300
EfficientNetB4   380
EfficientNetB5   456
EfficientNetB6   528
EfficientNetB7   600
BASE_MODEL, IMG_SIZE = ("efficientnet_b3", 300) #param ["(\"efficientnet_b4\", 380)", "(\"efficientnet_b2\", 260)"] {type:"raw", allow-input: true}
BATCH_SIZE = 32 #param {type:"integer"}
IMG_SIZE = (IMG_SIZE, IMG_SIZE)
print("Using {} with input size {}".format(BASE_MODEL, IMG_SIZE))
Using efficientnet_b3 with input size (300, 300)
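
To avoid typing the resolution by hand when switching base models, the table above can be kept as a lookup. This is a small sketch, not part of the original config: `EFFICIENTNET_RESOLUTION` and `image_size_for` are hypothetical names, and the keys assume the lowercase `efficientnet_bN` naming used for BASE_MODEL.

```python
# Native input resolutions, copied from the table above.
EFFICIENTNET_RESOLUTION = {
    "efficientnet_b0": 224,
    "efficientnet_b1": 240,
    "efficientnet_b2": 260,
    "efficientnet_b3": 300,
    "efficientnet_b4": 380,
    "efficientnet_b5": 456,
    "efficientnet_b6": 528,
    "efficientnet_b7": 600,
}

def image_size_for(base_model):
    """Return the (height, width) tuple expected by the given base model."""
    side = EFFICIENTNET_RESOLUTION[base_model]
    return (side, side)

print(image_size_for("efficientnet_b3"))
```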

Load data

After this quick and rough EDA, let's load the images into tensors so we can move on to data augmentation.

fastai defines item_tfms and batch_tfms in its data loader API. The item transforms perform a fairly large crop (to 224) on each image, while the batch transforms apply the other standard augmentations (in aug_transforms) at the batch level on the GPU. The batch size is set to 32 here.

Split Dataset

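Important: Keras' built-in data generator (flow_from_dataframe) takes labels as strings. The tf.data pipeline used below keeps them as integers instead (see the int64 labels in the output).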
 
train_df, valid_df = train_test_split(
    df
    ,test_size = 0.2
    ,random_state = SEED
    ,shuffle = True
    ,stratify = df['label'])
train_ds = tf.data.Dataset.from_tensor_slices(
    (train_df.filename.values,train_df.label.values))
valid_ds = tf.data.Dataset.from_tensor_slices(
    (valid_df.filename.values, valid_df.label.values))
    
adapt_ds = tf.data.Dataset.from_tensor_slices(
    train_df.filename.values)
for x,y in valid_ds.take(3):
  print(x, y)
tf.Tensor(b'/content/gdrive/MyDrive/1_AUSTIN CHEN/Data Scientist/Datasets/cassava-leaf-disease-classification/train_images/2484271873.jpg', shape=(), dtype=string) tf.Tensor(4, shape=(), dtype=int64)
tf.Tensor(b'/content/gdrive/MyDrive/1_AUSTIN CHEN/Data Scientist/Datasets/cassava-leaf-disease-classification/train_images/3704210007.jpg', shape=(), dtype=string) tf.Tensor(4, shape=(), dtype=int64)
tf.Tensor(b'/content/gdrive/MyDrive/1_AUSTIN CHEN/Data Scientist/Datasets/cassava-leaf-disease-classification/train_images/1655615998.jpg', shape=(), dtype=string) tf.Tensor(2, shape=(), dtype=int64)

Data generator

AUTOTUNE = tf.data.experimental.AUTOTUNE

Important: At this point, you may have noticed that I have not applied any normalization or rescaling. I recently discovered that Keras' pretrained EfficientNet includes a Normalization layer, as mentioned here.
def process_image(filename, label=None):
  # Read and decode a JPEG into a uint8 tensor of shape (height, width, 3)
  img = tf.io.read_file(filename)
  img = tf.image.decode_jpeg(img, channels=3)
  return img, label
  
def process_train(filename, label):
  # Light augmentation, then a random crop to the model input size
  img, _ = process_image(filename)
  img = tf.image.random_brightness(img, 0.3)
  img = tf.image.random_flip_left_right(img, seed=None)
  img = tf.image.random_crop(img, size=[*IMG_SIZE, 3])
  return img, label

def process_adapt(filename):
  # Full-size images rescaled to [0, 1], used only to adapt the Normalization layer
  img, _ = process_image(filename)
  img = tf.cast(img, tf.float32) / 255.0
  return img

def process_valid(filename, label):
  # No augmentation: resize the whole image to the model input size (float32 output)
  img, _ = process_image(filename)
  img = tf.image.resize(img, [*IMG_SIZE])
  return img, label
train_ds = train_ds.map(process_train, num_parallel_calls=AUTOTUNE)
valid_ds = valid_ds.map(process_valid, num_parallel_calls=AUTOTUNE)
adapt_ds = adapt_ds.map(process_adapt, num_parallel_calls=AUTOTUNE)
def show_images(ds):
  _,axs = plt.subplots(4,6,figsize=(24,16))
  for ((x, y), ax) in zip(ds.take(24), axs.flatten()):
    ax.imshow(x.numpy().astype(np.uint8))
    # Labels are integer class ids (not one-hot), so show them directly
    ax.set_title(int(y.numpy()))
    ax.axis('off')
show_images(train_ds)
show_images(valid_ds)

Improve performance

Note: I was previously shuffling the validation set, which was a bug, so the shuffle is commented out below.
train_ds_batch = (train_ds
                  .shuffle(buffer_size=1000)
                  .batch(BATCH_SIZE)
                  .prefetch(buffer_size=AUTOTUNE))

valid_ds_batch = (valid_ds
                  #.shuffle(buffer_size=1000)
                  .batch(BATCH_SIZE*2)
                  .prefetch(buffer_size=AUTOTUNE))

adapt_ds_batch = (adapt_ds
                  .shuffle(buffer_size=1000)
                  .batch(BATCH_SIZE)
                  .prefetch(buffer_size=AUTOTUNE))
image_batch, label_batch = next(iter(train_ds_batch))
plt.figure(figsize=(10, 10))
for i in range(16):
  ax = plt.subplot(4, 4, i + 1)
  plt.imshow(image_batch[i].numpy().astype("uint8"))
  label = label_batch[i].numpy()
  plt.title(label)
  plt.axis("off")

Data augmentation

data_augmentation = tf.keras.Sequential(
    [
      tf.keras.layers.experimental.preprocessing.RandomCrop(*IMG_SIZE),
      tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
      tf.keras.layers.experimental.preprocessing.RandomRotation(0.25),
      tf.keras.layers.experimental.preprocessing.RandomZoom((-0.2, 0)),
      tf.keras.layers.experimental.preprocessing.RandomContrast((0.2,0.2))
    ]
)
plt.figure(figsize=(10, 10))
augmented_images = data_augmentation(image_batch)  # augment the batch once
for i in range(16):
    ax = plt.subplot(4, 4, i + 1)
    plt.imshow(augmented_images[i].numpy().astype("uint8"))
    label = label_batch[i].numpy()
    plt.title(label)
    plt.axis("off")

Build model

I am using an EfficientNetB3, on top of which I add some output layers to predict our 5 classes (4 diseases and healthy). I decided to load the ImageNet pretrained weights locally to keep the internet off (a requirement for submitting a kernel to this competition).

from tensorflow.keras.applications import EfficientNetB3
!wget https://storage.googleapis.com/keras-applications/efficientnetb3_notop.h5
--2020-12-17 00:00:05--  https://storage.googleapis.com/keras-applications/efficientnetb3_notop.h5
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.9.208, 172.217.12.240, 172.217.164.176, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.9.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43941136 (42M) [application/x-hdf]
Saving to: ‘efficientnetb3_notop.h5’

efficientnetb3_noto 100%[===================>]  41.91M  72.3MB/s    in 0.6s    

2020-12-17 00:00:06 (72.3 MB/s) - ‘efficientnetb3_notop.h5’ saved [43941136/43941136]

efficientnet = EfficientNetB3(
    weights = dataset_path + "/efficientnetb3_notop.h5", 
    include_top = False, 
    input_shape = (*IMG_SIZE, 3), 
    drop_connect_rate = 0.4)
def build_model(base_model, num_class):
  # Alternative builder with a frozen backbone (not called below)
  inputs = tf.keras.layers.Input(shape=(*IMG_SIZE, 3))
  x = data_augmentation(inputs)

  # Freeze the pretrained weights
  base_model.trainable = False
  x = base_model(x)

  # Rebuild top
  x = tf.keras.layers.GlobalAveragePooling2D(name="avg_pool")(x)
  x = tf.keras.layers.BatchNormalization()(x)
  x = tf.keras.layers.Dropout(0.4, name="top_dropout")(x)
  outputs = tf.keras.layers.Dense(num_class, activation="softmax", name="pred")(x)

  return tf.keras.models.Model(inputs=inputs, outputs=outputs)
inputs       = tf.keras.layers.Input(shape=(*IMG_SIZE, 3))
augmented    = data_augmentation(inputs)
features     = efficientnet(augmented)  # keep `efficientnet` bound to the model itself
pooling      = tf.keras.layers.GlobalAveragePooling2D()(features)
dropout      = tf.keras.layers.Dropout(0.4)(pooling)
outputs      = tf.keras.layers.Dense(len(id2label), activation="softmax")(dropout)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)
model.summary()
Model: "functional_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_9 (InputLayer)         [(None, 300, 300, 3)]     0         
_________________________________________________________________
sequential_4 (Sequential)    (None, 300, 300, 3)       0         
_________________________________________________________________
efficientnetb3 (Functional)  (None, 10, 10, 1536)      10783535  
_________________________________________________________________
global_average_pooling2d_3 ( (None, 1536)              0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 1536)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 7685      
=================================================================
Total params: 10,791,220
Trainable params: 10,703,917
Non-trainable params: 87,303
_________________________________________________________________

The 3rd layer of EfficientNet is the Normalization layer, which can be adapted to our new dataset instead of ImageNet. Be patient with this one: it takes a while, as we're going through the entire training set.

%%time
model.get_layer('efficientnetb3').get_layer('normalization').adapt(adapt_ds_batch)
CPU times: user 13min 17s, sys: 11.6 s, total: 13min 29s
Wall time: 27min 17s
model.save_weights(filepath = dataset_path + "/000_normalization")

I had always wanted to try the CosineDecay schedule implemented in tf.keras, as it seemed promising, and I had struggled to find the right settings (if there were any) for ReduceLROnPlateau.
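
To get a feel for what the schedule does, here is a plain-Python sketch of the cosine decay formula as I understand it from the tf.keras documentation: the learning rate follows half a cosine from `initial_lr` down to `alpha * initial_lr` over `decay_steps` steps. `cosine_decay_lr` is a hypothetical helper for illustration only, and the step count of 535 per epoch is taken from the training log further down.

```python
import math

def cosine_decay_lr(step, initial_lr, decay_steps, alpha=0.0):
    """Learning rate at `step` for a cosine decay with floor alpha * initial_lr."""
    step = min(step, decay_steps)  # the rate stays at the floor after decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    return initial_lr * ((1.0 - alpha) * cosine + alpha)

# Settings matching this kernel: 535 steps per epoch, 8 epochs, alpha = 0.3.
total_steps = 535 * 8
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_decay_lr(step, 1e-4, total_steps, alpha=0.3))
```

With alpha = 0.3, the learning rate never drops below 30% of its initial value, which is a fairly gentle decay.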

EPOCHS = 8
decay_steps = int(round(len(train_df)/BATCH_SIZE)) * EPOCHS
cosine_decay = tf.keras.experimental.CosineDecay(
    initial_learning_rate=1e-4,
    decay_steps=decay_steps,
    alpha=0.3)

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        filepath='best_model.h5',
        monitor='val_loss',
        save_best_only=True)
    ]

model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(cosine_decay),
              metrics=["accuracy"])
history = model.fit(train_ds_batch,
                    epochs = EPOCHS,
                    validation_data=valid_ds_batch,
                    callbacks=callbacks)
Epoch 1/8
535/535 [==============================] - 1080s 2s/step - loss: 0.7556 - accuracy: 0.7221 - val_loss: 0.6072 - val_accuracy: 0.7759
Epoch 2/8
535/535 [==============================] - 669s 1s/step - loss: 0.5565 - accuracy: 0.8015 - val_loss: 0.6240 - val_accuracy: 0.7643
Epoch 3/8
535/535 [==============================] - 669s 1s/step - loss: 0.5112 - accuracy: 0.8199 - val_loss: 0.6173 - val_accuracy: 0.7659
Epoch 4/8
535/535 [==============================] - 672s 1s/step - loss: 0.4726 - accuracy: 0.8349 - val_loss: 0.5731 - val_accuracy: 0.7825
Epoch 5/8
535/535 [==============================] - 671s 1s/step - loss: 0.4488 - accuracy: 0.8423 - val_loss: 0.5691 - val_accuracy: 0.7886
Epoch 6/8
535/535 [==============================] - 669s 1s/step - loss: 0.4246 - accuracy: 0.8470 - val_loss: 0.5479 - val_accuracy: 0.7995
Epoch 7/8
535/535 [==============================] - 666s 1s/step - loss: 0.4091 - accuracy: 0.8572 - val_loss: 0.5511 - val_accuracy: 0.8005
Epoch 8/8
535/535 [==============================] - 666s 1s/step - loss: 0.3967 - accuracy: 0.8583 - val_loss: 0.5693 - val_accuracy: 0.7942

History

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Loss over epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'valid'], loc='best')
plt.show()

We load the best weights kept from the training phase. To check how the model is performing, we will attempt predictions over the validation set, which can help highlight any classes that are consistently miscategorised.

model.load_weights('best_model.h5')
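
A confusion matrix is a simple way to spot consistently miscategorised classes. The sketch below uses plain Python and made-up labels; `confusion_matrix` and `per_class_recall` are hypothetical helpers. In this kernel the predictions would presumably come from something like `np.argmax(model.predict(valid_ds_batch), axis=1)`, compared against `valid_df.label` (which only lines up because the validation set is not shuffled), but that step is an assumption and is not shown in the original.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class predicted correctly (None if the class is absent)."""
    recalls = []
    for cls, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[cls] / total if total else None)
    return recalls

# Toy example (made up): the model confuses classes 0 and 4 with the majority class 3.
y_true = [0, 0, 1, 1, 3, 3, 3, 4]
y_pred = [0, 3, 1, 1, 3, 3, 3, 3]
cm = confusion_matrix(y_true, y_pred, num_classes=5)
recall = per_class_recall(cm)
for row in cm:
    print(row)
print(recall)
```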

Predict

def scan_over_image(img_path, crop_size=512):
    '''
    Extract four crop_size x crop_size crops (512x512 by default),
    one anchored at each corner of the original image,
    with some overlap between crops
    '''
    
    img = Image.open(img_path)
    img_width, img_height = img.size  # PIL returns (width, height)
    img = np.array(img)               # array shape is (height, width, channels)

    x_img_origins = [0,img_width-crop_size]
    y_img_origins = [0,img_height-crop_size]
    img_list = []
    for x in x_img_origins:
        for y in y_img_origins:
            # NumPy indexes rows (y) first, then columns (x)
            img_list.append(img[y:y+crop_size , x:x+crop_size,:])
  
    return np.array(img_list)
def display_samples(img_path):
    '''
    Display all 512x512 images extracted from original images
    '''
    
    img_list = scan_over_image(img_path)
    sample_number = len(img_list)
    fig = plt.figure(figsize = (8,sample_number))
    for i in range(0,sample_number):
        ax = fig.add_subplot(2, 4, i+1)
        ax.imshow(img_list[i])
        ax.set_title(str(i))
    plt.tight_layout()
    plt.show()
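
scan_over_image only takes the four corner crops. If more crops were wanted, the start offsets along each axis could be spaced evenly, as in the sketch below. `crop_origins` is a hypothetical helper and not part of the original kernel; it assumes the crop fits inside the image.

```python
def crop_origins(length, crop_size, n_crops):
    """Evenly spaced start offsets for n_crops windows of crop_size along one axis."""
    if crop_size > length:
        raise ValueError("crop_size must not exceed the image dimension")
    if n_crops == 1:
        return [0]
    span = length - crop_size  # largest valid start offset
    return [round(i * span / (n_crops - 1)) for i in range(n_crops)]

# For the 800x600 example image and 512-pixel crops:
x_origins = crop_origins(800, 512, n_crops=2)
y_origins = crop_origins(600, 512, n_crops=2)
print(x_origins, y_origins)
```

With two crops per axis this reproduces the corner origins used above; the windows cover the full axis as long as n_crops * crop_size is at least the image dimension.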

1% Better Everyday

https://www.kaggle.com/frlemarchand/efficientnet-aug-tf-keras-for-cassava-diseases
https://www.kaggle.com/harveenchadha/efficientnetb3-keras-tf2-baseline-training

todos

  • Find out the intuition and the difference between item_tfm and batch_tfm
  • Customize my own data generator as fastai creates their Dataloader
  • Prepare a special dataset that will be fed to the Normalization layer. The EfficientnetB3 provided by tf.keras includes an out-of-the-box Normalization layer fit onto the imagenet dataset. Therefore, we can pull that layer and use the adapt function to retrain it to the Cassava Disease dataset.
  • The 3rd layer of EfficientNet is the Normalization layer, which can be adapted to our new dataset instead of ImageNet. Be patient with this one: it takes a while, as we're going through the entire training set.

done

  • Try out the data_generator and the data_frame_iterator
  • Removed the normalization step in the generator, since EfficientNet normalizes inside the model itself and expects inputs in the range [0, 255]

Augmentation

Albumentations is used here primarily for resizing and normalization (A.ToFloat scales pixel values to [0, 1]).

def albu_transforms_train(data_resize): 
    return A.Compose([
            A.ToFloat(),
            A.Resize(data_resize, data_resize),
        ], p=1.)

# For Validation 
def albu_transforms_valid(data_resize): 
    return A.Compose([
            A.ToFloat(),
            A.Resize(data_resize, data_resize),
        ], p=1.)
def CutMix(image, label, DIM, PROBABILITY = 1.0):
    # input image - is a batch of images of size [n,dim,dim,3] not a single image of [dim,dim,3]
    # input label - is a batch of one-hot labels of size [n,CLASSES], not integer class ids
    # output - a batch of images with cutmix applied
    CLASSES = 5
    
    imgs = []; labs = []
    for j in range(len(image)):
        # DO CUTMIX WITH PROBABILITY DEFINED ABOVE
        P = tf.cast( tf.random.uniform([],0,1)<=PROBABILITY, tf.int32)
        
        # CHOOSE RANDOM IMAGE TO CUTMIX WITH
        k = tf.cast( tf.random.uniform([],0,len(image)),tf.int32)
        
        # CHOOSE RANDOM LOCATION
        x = tf.cast( tf.random.uniform([],0,DIM),tf.int32)
        y = tf.cast( tf.random.uniform([],0,DIM),tf.int32)
        
        b = tf.random.uniform([],0,1) # this is beta dist with alpha=1.0
        
        WIDTH = tf.cast( DIM * tf.math.sqrt(1-b),tf.int32) * P
        ya = tf.math.maximum(0,y-WIDTH//2)
        yb = tf.math.minimum(DIM,y+WIDTH//2)
        xa = tf.math.maximum(0,x-WIDTH//2)
        xb = tf.math.minimum(DIM,x+WIDTH//2)

        # MAKE CUTMIX IMAGE
        one = image[j,ya:yb,0:xa,:]
        two = image[k,ya:yb,xa:xb,:]
        three = image[j,ya:yb,xb:DIM,:]
        middle = tf.concat([one,two,three],axis=1)
        img = tf.concat([image[j,0:ya,:,:],middle,image[j,yb:DIM,:,:]],axis=0)
        imgs.append(img)
        
        # MAKE CUTMIX LABEL
        a = tf.cast(WIDTH*WIDTH/DIM/DIM,tf.float32)
        labs.append((1-a)*label[j] + a*label[k])
            
    # RESHAPE HACK SO TPU COMPILER KNOWS SHAPE OF OUTPUT TENSOR (maybe use Python typing instead?)
    image2 = tf.reshape(tf.stack(imgs),(len(image),DIM,DIM,3))
    label2 = tf.reshape(tf.stack(labs),(len(image),CLASSES))
    
    return image2,label2
def MixUp(image, label, DIM, PROBABILITY = 1.0):
    # input image - is a batch of images of size [n,dim,dim,3] not a single image of [dim,dim,3]
    # input label - is a batch of one-hot labels of size [n,CLASSES], not integer class ids
    # output - a batch of images with mixup applied
    CLASSES = 5
    
    imgs = []; labs = []
    for j in range(len(image)):
        # DO MIXUP WITH PROBABILITY DEFINED ABOVE
        P = tf.cast( tf.random.uniform([],0,1)<=PROBABILITY, tf.float32)
                   
        # CHOOSE RANDOM
        k = tf.cast( tf.random.uniform([],0,len(image)),tf.int32)
        a = tf.random.uniform([],0,1)*P # this is beta dist with alpha=1.0
                    
        # MAKE MIXUP IMAGE
        img1 = image[j,]
        img2 = image[k,]
        imgs.append((1-a)*img1 + a*img2)
                    
        # MAKE MIXUP LABEL
        labs.append((1-a)*label[j] + a*label[k])
            
    # RESHAPE HACK SO TPU COMPILER KNOWS SHAPE OF OUTPUT TENSOR (maybe use Python typing instead?)
    image2 = tf.reshape(tf.stack(imgs),(len(image),DIM,DIM,3))
    label2 = tf.reshape(tf.stack(labs),(len(image),CLASSES))
    return image2,label2
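
Both functions blend labels as (1-a)*label[j] + a*label[k], which only makes sense for one-hot labels, whereas the tf.data pipeline earlier in this kernel yields integer class ids. The plain-Python sketch below illustrates the label arithmetic on a toy pair; `one_hot` and `mix_labels` are hypothetical helpers, and in TensorFlow the conversion would presumably be done with `tf.one_hot`.

```python
def one_hot(label, num_classes):
    """Encode an integer class id as a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def mix_labels(label_a, label_b, a):
    """Blend two one-hot labels, giving weight (1 - a) to the first and a to the second."""
    return [(1.0 - a) * x + a * y for x, y in zip(label_a, label_b)]

# A CMD image (class 3) mixed with 25% of a Healthy image (class 4).
mixed = mix_labels(one_hot(3, 5), one_hot(4, 5), a=0.25)
print(mixed)
```

Since the mixed labels are soft, training on them would need `categorical_crossentropy` rather than the `sparse_categorical_crossentropy` loss used in this kernel.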